Search Result

Journals

Publication Years

Keywords

Please wait a minute...

For Selected:

Download Citations
EndNote Ris BibTeX

Toggle Thumbnails

Select

Improved automatic classification algorithm of software bug report in cloud environment

HUANG Wei, LIN Jie, JIANG Yu'e

Journal of Computer Applications 2016, 36 (5): 1212-1215. DOI: 10.11772/j.issn.1001-9081.2016.05.1212

Abstract （517）

PDF （705KB）（399）

Save

User-submitted bug reports are arbitrary and subjective. The accuracy of automatic classification of bug reports is not ideal. Hence it requires many human labors to intervention. With the bug reports database growing bigger and bigger, the problem of improving the accuracy of automatic classification of these reports is becoming urgent. A TF-IDF (Term Frequency-Inverse Document Freqency) based Naive Bayes (NB) algorithm was proposed. It not only considered the relationship of a term in different classes but also the relationship of a term inside a class. It was also implemented in distributed parallel environment of MapReduce model in Hadoop platform. The experimental results show that the proposed Naive Bayes algorithm improves the performance of F1 measument to 71%, which is 27 percentage points higher than the state-of-the-art method. And it is able to deal with massive amounts of data in distributed way by addding computational node to offer shorter running time and has better effective performance.

Reference | Related Articles | Metrics

Select

Local similarity detection algorithm for time series based on distributed architecture

LIN Yang, JIANG Yu'e, LIN Jie

Journal of Computer Applications 2016, 36 (12): 3285-3291. DOI: 10.11772/j.issn.1001-9081.2016.12.3285

Abstract （631）

PDF （1125KB）（481）

Save

The CrossMatch algorithm based on the idea of Dynamic Time Warping (DTW) can be used to solve the problems of local similarity between time series. However, due to the high complexity of time and space, large amounts of computing resources are required. Thus, it is almost impossible to be used for long sequences. To solve the above mentioned problems, a new algorithm for local similarity detection based on distributed platform was proposed. The proposed algorithm was a distributed solution for CrossMatch. The problem of insufficient computing resources including time and space requirement was solved. Firstly, the series should be splited and distributed on several nodes. Secondly, the local similarity of every node's own series was dealt with. Finally, the results would be merged and assembled in order to find the local similarity of series. The experimental results show that the accuracy between the proposed algorithm and the CrossMatch algorithm is similar, and the proposed algorithm uses less time. The improved distributed algorithm can not only solve the computation problem of long sequence of time series which can not be processed by a single machine, but also improve the running speed by increasing the number of parallel computing nodes.

Reference | Related Articles | Metrics